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Abstract. An important aspect of modeling spatially-referenced data is 
appropriately specifying the covariance function of the random field. A 
practitioner working with spatial data is presented a number of choices 
regarding the structure of the dependence between observations. One of 
these choices is determining whether or not an isotropic covariance func¬ 
tion is appropriate. Isotropy implies that spatial dependence does not de¬ 
pend on the direction of the spatial separation between sampling locations. 
Misspecification of isotropy properties (directional dependence) can lead 
to misleading inferences, e.g., inaccurate predictions and parameter es¬ 
timates. A researcher may use graphical diagnostics, such as directional 
sample variograms, to decide whether the assumption of isotropy is rea¬ 
sonable. These graphical techniques can be difficult to assess, open to sub¬ 
jective interpretations, and misleading. Hypothesis tests of the assumption 
of isotropy may be more desirable. To this end, a number of tests of direc¬ 
tional dependence have been developed using both the spatial and spectral 
representations of random fields. We provide an overview of nonparametric 
methods available to test the hypotheses of isotropy and symmetry in spa¬ 
tial data. We summarize test properties, discuss important considerations 
and recommendations in choosing and implementing a test, compare sev¬ 
eral of the methods via a simulation study, and propose a number of open 
research questions. Several of the reviewed methods can be implemented 
in R using our package spTest, available on GRAN. 

MSC 2010 subject classifications: Primary 62M30, ; secondary 62G10. 
Key words and phrases: isotropy, symmetry, nonparametric spatial covari¬ 
ance. 


1. INTRODUCTION 

Early spatial models relied on the simplifying assumptions that the covariance 
function is stationary and isotropic. With the emergence of new sources of spatial 
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data, for instance, remote sensing via satellite, climate model output, or environ¬ 
mental monitoring, a variety of methods and models have been developed that 
relax these assumptions. In the case of anisotropy, there are a number of methods 
for modeling both zonal anisotropy (Journel and Huijbregts, 1978, pg. 179-184; 
Ecker and Gelfand, 2003; Schabenberger and Gotway, 2004, pg. 152; Banerjee 
et ah, 2014, pg. 31) and geometric anisotropy (Borgman and Chao, 1994; Ecker 
and Gelfand, 1999). Rapid growth of computing power has allowed the imple¬ 
mentation and estimation of these models. 

When modeling a spatial process, the specification of the covariance function 
will have an effect on kriging and parameter estimates and the associated uncer¬ 
tainty (Cressie, 1993, pg. 127-135). Sherman (2011, pg. 87-90) and Guan et al. 
(2004) use numerical examples to demonstrate the adverse implications of incor¬ 
rectly specifying isotropy properties on kriging estimates. Given the variety of 
choices available regarding the properties of the covariance function (e.g., para¬ 
metric forms, isotropy, stationarity) and the effect these choices can have on in¬ 
ference, practitioners may seek methods to inform the selection of an appropriate 
covariance model. 

A number of graphical diagnostics have been proposed to determine isotropy 
properties. Perhaps the most commonly used methods are directional semivari- 
ograms and rose diagrams (Matheron, 1961; Isaaks and Srivastava, 1989, pg. 149- 
154). Banerjee et al. (2014, pg. 38-40) suggest using an empirical semivariogram 
contour plot to assess isotropy as a more informative method than directional 
sample semivariograms. Another technique involves comparing empirical esti¬ 
mates of the covariance at different directional lags to assess symmetry for data 
on gridded sampling locations (Modjeska and Rawlings, 1983). One caveat of the 
aforementioned methods is that they can be challenging to assess, are open to 
subjective interpretations, and can be misleading (Guan et al., 2004) because 
they typically do not include a measure of uncertainty. Experienced statisticians 
may have intuition about the interpretation and reliability of these diagnostics, 
but a novice user may wish to evaluate assumptions via a hypothesis test. 

Statistical hypothesis tests of second order properties can be used to supple¬ 
ment and reinforce intuition about graphical diagnostics and can be more objec¬ 
tive. Like the graphical techniques, hypothesis tests have their own caveats; for 
example, a parametric test of isotropy demands specification of the covariance 
function. A nonparametric method for testing isotropy avoids the potential prob¬ 
lems of misspecification of the covariance function and the requirement of model 
estimation under both the null and alternative hypothesis, which can be compu¬ 
tationally expensive for large datasets. Eurthermore, nonparametric methods do 
not require the common assumption of geometric anisotropy. However, in aban¬ 
doning the parametric assumptions about the covariance function, implementing 
a test of isotropy presents several challenges (see Section 5). A nonparametric test 
of isotropy or symmetry can serve as another form of exploratory data analysis 
that supplements graphical techniques and informs the choice of an appropriate 
nonparametric or parametric model. Eigure 1 illustrates the process for assessing 
and modeling isotropy properties. 

In this article we review nonparametric hypothesis tests developed to test the 
assumptions of symmetry and isotropy in spatial processes. We summarize tests 
in both the spatial and spectral domain and provide tables that enable convenient 
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Fig 1: A flow chart illustrating the process of determining and modeling isotropy 
in spatial data. The gray boxes indicate the focus of this paper. 
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comparisons of test properties. A simulation study evaluates the empirical size 
and power of several of the methods and enables a direct comparison of method 
performance. The simulations also lead to new insights into test performance 
and implementation beyond those given in the original works. Finally, we include 
graphics that demonstrate considerations for choosing a nonparametric test and 
illustrate the process of determining isotropy properties. 

The remainder of this article is organized as follows: Section 2 establishes nota¬ 
tion and dehnitions; Section 3 details the various nonparametric hypothesis tests 
of isotropy and symmetry and includes Tables 1-3 which facilitate comparison 
between tests as well as test selection for users; Section 4 describes the simula¬ 
tion study comparing the various methods; Sections 5 and 6 provide discussion 
and conclusions. Additional details on the simulation study are specified in the 
Appendix. 


2. NOTATION AND DEFINITIONS 

Here we briefly review key definitions required for tests of isotropy. For addi¬ 
tional background, see Schabenberger and Gotway (2004). Let {T(s) : s £ V C 
d > 1} be a second order stationary random field (RF). Below we will assume 
that d = 2, although many of the results hold for the more general case of d > 2. 
For a spatial lag h = (/ii,/i 2 ), the semivariogram function describes dependence 
between observatons, Y, at spatial locations separated by lag h and is dehned as 

(2.1) 7(h) = ^Var(y(s-L h) - y(s)). 

The classical, moment-based estimator of the semivariogram (Matheron, 1962) is 

^(‘') = ^^X;|r(s)-r(s + h)l^ 

where the sum is over P(h) = {s : s, s -|- h G D} and |T>(h)| is the number of 
elements in T>(\y). The set T’(h) is the set of sampling location pairs that are 
separated by spatial lag h. The covariance function, C'(h), is an alternative to 
the semivariogram for describing spatial dependence and is given by C'(h) = 
lim||v||_,.oo 7(v) — 7(h) if the limit exists. 

Let {si,..., s„} C P be the finite set of locations at which the random pro¬ 
cess is observed, providing the random vector {Y (si ),... ,Y (s„))^. The sampling 
locations may follow one of several spatial sampling designs: gridded locations, 
randomly and uniformly distributed locations, or a cluster design. Some authors 
make the distinction between a lattice process and a geostatistical process ob¬ 
served on a grid (Fuentes and Reich, 2010; Schabenberger and Gotway, 2004, 
pg. 6-10). We do not explore this distinction further and will use the term grid 
throughout. 

It is often of interest to infer the effect of covariates on the process, deduce de¬ 
pendence structure, and/or predict Y with quantifiable uncertainty at new loca¬ 
tions. To achieve these goals, the practitioner must specify the distributional prop¬ 
erties of the spatial process. A common assumption is that the finite-dimensional 
joint distribution is multivariate normal (MVN), in which case we call the RF 
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a Gaussian random field (GRF). We are interested in second order properties; 
thus, hereafter we assume that the mean of the RF is zero. 

A common simplifying assumption on the spatial dependence structure is that 
it is isotropic. 

Definition 2.1. A second order stationary spatial process is isotropic if the 
semivariogram, 7(h), [or equivalently, the covariance function C{h.)] of the spatial 
process depends on the lag veetor h only through its Euclidean length, h = ||h||, 
i.e., 7(h) = 7o(/i) for some function 7o(-) of a univariate argument. 

Isotropy implies that the dependence between any two observations depends only 
on the distance between their sampling locations and not on their relative orien¬ 
tation. A spatial process that is not isotropic is called anisotropic. Anisotropy is 
broadly categorized as either geometric or zonal (Zimmerman, 1993). In practice, 
if a process is assumed to be anisotropic, it is usually assumed to be geometri¬ 
cally anisotropic due to its precise formal and functional dehnition (Ecker and 
Gelfand, 1999). Geometric anisotropy is governed by a scaling parameter, R, and 
rotation parameter, 0, and implies the anisotropy can be corrected via a linear 
transformation of the lag vector or, equivalently, the sampling locations (Cressie, 
1993, pg. 64). 

Symmetry is another directional property of the covariance (semivariogram) 
function, which is often used to describe the spatial variation of processes on a 
grid. We discuss symmetry properties here as they are a subset of isotropy, and 
methods for testing isotropy can often be used to test symmetry. The following 
definitions come from Lu and Zimmerman (2005) and Scaccia and Martin (2005) 
where the notation C{hi,h 2 ) denotes the covariance between random variables 
located hi columns and /i 2 rows apart on a rectangular grid, denoted 

Definition 2.2. A second order stationary spatial process on a grid is reflec¬ 
tion or axially symmetric if C{hi,h2) = C{—hi,h2) for all (/ii,/i2) G 

Definition 2.3. A second order stationary spatial process on a grid is diag¬ 
onally or laterally symmetric if C{hi, h2) = C{h2,hi) for all (/ii,/i2) G 

Definition 2.4. A second order stationary spatial process on a grid is com¬ 
pletely symmetric if it is both reflection and laterally symmetric, i.e., C{hi,h2) = 
C{-hi, /12) = C{h2, hi) = C{-h2, hi) for all {hi, /12) G 


Complete symmetry is a weaker property than isotropy. Isotropy requires that 
C{hi, / 12 ) depends only on for all hi,h2. The relationship between these 

properties is given by: 


. , , , , reflection symmetry 

(2.3) isotropy =;> complete symmetry => . 

diagonal symmetry 

Thus, rejecting a null hypothesis of reflection symmetry implies evidence against 
the assumptions of reflection symmetry, complete symmetry, and isotropy. How¬ 
ever, failure to reject a null hypothesis of reflection symmetry does not imply an 
assumption of complete symmetry or isotropy is appropriate. 
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The aforementioned properties of isotropy and symmetry were defined in terms 
of examining the spatial random process in the spatial domain, where second 
order properties depend on the spatial separation, h. Alternatively, a spatial 
random process can be represented in the spectral domain using Fourier analysis. 
For the purposes of investigating second order properties, we are interested in 
the spectral representation of the covariance function, called the spectral density 
function and denoted /(^), where ld = {uji,uj 2 )- Under certain conditions and 
assumptions (Fuentes and Reich, 2010, pg. 62), the spectral density function is 
given by 

(2.4) f[u) = —^ [ ex.p{-iu'^h)C{h)dh, 

[27Ty J^2 

so that the covariance function, C'(h), and the spectral density function, /(a;), 
form a Fourier transform pair. 

Properties of the covariance function imply properties of the spectral density. 
For example, if the covariance function is isotropic (2.1), then the spectral den¬ 
sity (2.4) depends on u only through its length, u = ||u;||, and we can write 
/(cj) = fo{uj), where /o(-) is called the isotropic spectral density (Fuentes, 2013). 
Consequently, second order properties of a second order stationary RF can be 
explored via either the covariance function or the spectral density function. Test 
statistics quantifying second order properties can be constructed using the pe- 
riodogram, an estimator of (2.4) and denoted by /(•). For a real-valued spatial 
process on a rectangular grid 7? C 'S?, a moment-based periodogram used to 
estimate (2.4) is 


^ ni-l 712-1 

(2.5) I{UJI,UJ2) = . .2 E E C{hi,h2) cos{hiuJi + h2UJ2) , 

' ' hi=—ni+l h2=—712 + 1 

where ni and n 2 denote the number of rows and columns of the grid and C{hi, / 12 ) 
is a non-parametric estimator of the covariance function. In practice, the peri¬ 
odogram (2.5) is used to estimate the spectral density at the Fourier or harmonic 
frequencies. The frequency u = (a;i,a; 2 ) is a Fourier or harmonic frequency if ujj 
is a multiple of 27r/nj, j = 1,2. Furthermore, the set of frequencies is limited to 
{uij = 2'Kkj/nj, kj = 1,2,... ,n*}, where n* is (nj — l)/2 if nj is odd and nj/2 — 1 
if rij is even. 

3. TESTS OF ISOTROPY AND SYMMETRY 
3.1 Brief History 

Matheron (1961) developed one of the earliest hypothesis test of isotropy when 
he used normality of sample variogram estimators to construct a t^st for 
anisotropy in mineral deposit data. Cabana (1987) tested for geometric anisotropy 
using level curves of random fields. Vecchia (1988) and Baczkowski and Mardia 
(1990) developed tests for isotropy assuming a parametric covariance function. 
Baczkowski (1990) also proposed a randomization test for isotropy. Despite these 
early works, little work on testing isotropy was published during the 1990s, al¬ 
though the PhD dissertation work of Lu (1994) would eventually have an notewor¬ 
thy impact on the literature. Then, in the 2000s, a number of nonparametric tests 
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of second-order properties emerged. Some of the developments used estimates of 
the variogram or covariogram to test symmetry and isotropy properties (Lu and 
Zimmerman, 2001; Guan, 2003; Guan et ah, 2004, 2007; Maity and Sherman, 
2012). These works generally borrowed ideas from two bodies of literature: (a) 
theory on the distributional and asymptotic properties of semivariogram estima¬ 
tors (e.g., Baczkowski and Mardia, 1987; Cressie, 1993, pg. 69-47; Hall and Path, 
1994) and (b) subsampling techniques to estimate the variance of statistics de¬ 
rived from spatial data (e.g., Possolo, 1991; Politis and Sherman, 2001; Sherman, 
1996; Lahiri, 2003; Lahiri and Zhu, 2006). Other nonparametric methods used 
the spectral domain to test isotropy and symmetry (Scaccia and Martin, 2002, 
2005; Lu and Zimmerman, 2005; Puentes, 2005). These works generally extended 
ideas used in the time series literature (e.g., Priestley and Rao, 1969; Priestley, 
1981) to the spatial case. Methods for testing isotropy and symmetry in both the 
spatial and spectral domains, under the assumption of a parametric covariance 
function, have also been developed recently (Stein et ah, 2004; Haskard, 2007; 
Puentes, 2007; Matsuda and Yajima, 2009; Scaccia and Martin, 2011). 

3.2 Nonparametric Methods in the Spatial Domain 

A popular approach to testing second order properties was pioneered in the 
works of Lu (1994) and Lu and Zimmerman (2001) who leveraged the asymptotic 
joint normality of the sample variogram computed at different spatial lags. The 
subsequent works of Guan et al. (2004, 2007) and Maity and Sherman (2012) 
built upon these ideas and are the primary focus of this subsection. Lu (1994) 
details methods for testing axial symmetry. While Lu and Zimmerman (2001), 
Guan et al. (2004), and Maity and Sherman (2012) focus on testing isotropy, 
these methods can also be used to test symmetry. Pinally, Bowman and Gru- 
jeiras (2013) detail a more computational approach for testing isotropy. Both 
Li et al. (2007, 2008b) and Jun and Genton (2012) use an approach analogous 
to the methods from Lu and Zimmerman (2001), Guan et al. (2004, 2007), and 
Maity and Sherman (2012) but for spatiotemporal data. Table 1 summarizes test 
properties discussed in this section and Section 3.3. 

Nonparametric tests for anisotropy in the spatial domain are based on a null 
hypothesis that is an approximation to isotropy. Under the null hypothesis that 
the RF is isotropic, it follows that the values of 7 (-) evaluated at any two spatial 
lags that have the same norm are equal, regardless of the direction of the lags. To 
fully specify the most general null hypothesis of isotropy, theoretically, one would 
need to compare variogram values for an infinite set of lags. In practice a small 
number of lags are specified. Then it is possible to test a hypothesis consisting of 
a set of linear contrasts of the form 


(3.1) Ho : A7(-) = 0 

as a proxy for the null hypothesis of isotropy, where A is a full row rank matrix 
(Lu and Zimmerman, 2001). For example, a set of lags, denoted A, commonly 
used in practice for gridded sampling locations with unit spacing is 


(3.2) A = {hi = (1,0),h2 = (0,1), hg = (1,1),h4 = (-1,1)}, 

and the corresponding A matrix under Hq : A 7 (A) = 0 is 
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One of the first steps in detecting potential anisotropy is the choice of lags, as the 
test results will only hold for the particular set of lags considered (Guan et ah, 
2004). While this choice is subjective, there are several considerations and useful 
guidelines for determining the set of lags (see Section 5). 

For nonparametric tests of symmetry, the null hypothesis of symmetry using 
(3.1) can be expressed by a countable set of contrasts for a process observed 
on a grid. Tests of symmetry will be subject to similar practical considers as 
tests of isotropy, and practitioners testing symmetry properties will need to 
choose a small set of lags and form a hypothesis that is a surrogate for sym¬ 
metry. For example, testing reflection symmetry of a process observed on the 
integer grid would require comparing estimates of C(-) evaluated at the lag pairs 
{( 1 , 0 ), (- 1 , 0 )}, {( 2 , 0 ), (- 2 , 0 )}, {( 1 , 1 ), (- 1 , 1 )}, etc. 

The tests in Lu and Zimmerman (2001), Guan et al. (2004, 2007), and Maity 
and Sherman (2012) involve estimating either the semivariogram, '/(•), or co- 
variance, C(-), function at the set of chosen lags, A. Denoting the set of point 
estimates of the semivariogram/covariance function at the given lags as G^, the 
true values as G, and normalizing constant an, a central result for all three meth¬ 
ods is that 

(3.4) an{Gn-G) MVNiO,'S), 

n—^oo 

under increasing domain asymptotics and mild moment and mixing conditions 
on the RF. Using the A matrix, an estimate of the variance covariance matrix, 
5], and G^, a quadratic form is constructed, and a p-value can be obtained from 
an asymptotic distribution with degrees of freedom given by the row rank of 
A. The primary differences between these works are the assumed distribution of 
sampling locations, the shape of the sampling domain, and the estimation of G 
and X). These differences are important when choosing a test that is appropriate 
for a particular set of data (see Tables 1 and 2 and Figure 4 for more information 
about these differences). 

Maity and Sherman (2012) develop a test with the fewest restrictions on the 
shape of the sampling region and distribution of sampling locations. Their test 
can be used when the sampling region is any convex subset in and the distri¬ 
bution of sampling locations in the region follows any general spatial sampling 
design. The test in Guan et al. (2004) also allows convex subsets in and is de¬ 
veloped for gridded and non-gridded sampling locations but requires non-gridded 
sampling locations to be uniformly distributed on the domain, i.e., generated by 
a homogenous Poisson process. The Poisson assumption is dropped in Guan et al. 
(2007). Lu and Zimmerman (2001) require the domain to be rectangular and the 
observations to lie on a grid. 

Another difference between methods is the form of the nonparametric estima¬ 
tor of G. In Lu and Zimmerman (2001), Gn is computed using the log of the 
classical sample semivariogram estimator (2.2). Guan et al. (2004, 2007) also use 
the estimator in ( 2 . 2 ) for gridded sampling locations, but use a kernel estimator of 
7 (h) for non-gridded locations. Maity and Sherman (2012) use a kernel estimator 
of the covariance function. When smoothing over spatial lags in M^, the kernel 
is typically given as Nadaraya-Watson (Nadaraya, 1964; Watson, 1964) prod¬ 
uct kernel, independently smoothing over horizontal and vertical lags. Gommon 
choices for the kernel are the Epanechnikov or truncated Gaussian kernels. The 
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Table 1: Properties of nonpararaetric tests of isotropy. “Domain” refers to the domain used to represent the RF (spatial or spectral), “Test 
Stat Based On” lists the nonparametric estimator used to construct the test statistic “Distb’n” gives the limiting asymptotic distribution 
of the test statistic, and “GP” denotes whether the test requires data to be generated from a Gaussian process. 
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kernel estimators also require the choice of a bandwidth parameter, w. Choosing 
an appropriate bandwidth is one of the challenges of implementing the tests for 
non-gridded sampling locations, and the conclusion of the test has the poten¬ 
tial to be sensitive to the choice of the bandwidth parameter (see Section 5 for 
recommendations on bandwidth selection). 

Nonparametric tests in the spatial domain also vary in the estimation of S, the 
asymptotic variance-covariance of in (3.4). Lu and Zimmerman (2001) use a 
plug-in estimator, which requires the choice of a parameter, m, that truncates the 
sum used in estimation. Spatial resampling methods are another approach used to 
estimate S. The method used for spatial resampling and properties of estimators 
computed from spatial resampling will depend on the underlying spatial sampling 
design (Lahiri, 2003, pg. 281). Guan et al. (2004, 2007) use a moving window ap¬ 
proach, creating overlapping subblocks that cover the sampling region. Maity and 
Sherman (2012) employ the grid based block bootstrap (GBBB) (Lahiri and Zhu, 
2006). The GBBB approach divides the spatial domain into regions, then replaces 
each region by sampling (with replacement) a block of the sampling domain hav¬ 
ing the same shape and volume as the region, creating a spatial permutation of 
blocks of sampling locations. When using the resampling methods, the user must 
chose the window or block size and the conclusion of the test has the potential to 
change based on the chosen size. Irregularly shaped sampling domains can pose 
a challenge in using the subsampling methods. For example, for an irregularly 
shaped sampling domain, many incomplete blocks may complicate the subsam¬ 
pling procedure. We summarize guidelines for choosing the window/block size in 
Section 5. 

Another approach to testing isotropy in the spatial domain is given by Bowman 
and Grujeiras (2013) who take a more empirical and computationally-intensive 
approach. Their methods are available in the R software (R Core Team, 2014) 
package sm (Bowman and Azzalini, 2014). One caveat of using the sm package is 
that the methods are computationally expensive, even for moderate sample sizes. 
For example, a test of isotropy on 300 uniformly distributed sampling locations on 
a 10 X 16 sampling domain took approximately 9.5 minutes where the methods 
from Guan et al. (2004) took 1.6 seconds using a laptop with 8 GB of memory 
and a 2 GHz Intel Core i7 processor. Because of the computational costs, we do 
not consider this method further. 

3.3 Nonparametric Methods in the Spectral Domain 

For gridded sampling locations, nonparametric spectral methods have been 
developed for testing symmetry (Scaccia and Martin, 2002, 2005; Lu and Zim¬ 
merman, 2005) and stationarity (Fuentes, 2005), but none are designed with a 
primary goal of testing isotropy. Due to the difficulties presented by non-gridded 
sampling locations, historically there have been fewer developments using spec¬ 
tral methods for non-gridded sampling locations than for gridded (or lattice) 
data, but this is an area that has received more attention recently (see, e.g., 
Fuentes, 2007; Matsuda and Yajima, 2009; Bandyopadhyay et al., 2015). Despite 
the challenges, Van Hala et al. (2014) have proposed a nonparametric, empirical 
likelihood approach to test isotropy and separability for non-gridded sampling 
locations. 

The primary motivation for using the spectral domain over the spatial domain 
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are simpler asymptotics in the spectral domain. Unlike estimates of the variogram 
or covariogram at different spatial lags, estimates of the spectral density at differ¬ 
ent frequencies via the periodogram are asymptotically independent under certain 
conditions (Pagano, 1971; Schabenberger and Gotway, 2004, pg. 78,194). Addi¬ 
tionally, in practice, tests of symmetry in the spectral domain are generally not 
subject to as many choices (e.g., spatial lag set, bandwidth, block size) as those 
in the spatial domain. 

Analogous to testing isotropy in the spatial domain by using a finite set of spa¬ 
tial lags, tests of symmetry in the spectral domain typically involve estimating 
and comparing the spectral density (2.4) at a finite set of the Fourier frequencies. 
For example, axial symmetry (2.2) of the covariance function implies axial sym¬ 
metry of the spectral density, f{uji,uj 2 ) = f{—u}i,uj 2 ), which can be evaluated 
by comparing I{u}i,uj 2 ) to I{—uji,lo 2 ) at a finite set of frequencies. Similarly, the 
null hypothesis of isotropy can be approximated by comparing periodogram (2.5) 
estimates at a set of distinct frequencies with the same norm (Fuentes, 2005). 
Although most of the current spectral methods are not directly designed to test 
isotropy, the hypothesis of complete symmetry can be used to reject the assump¬ 
tion of isotropy due to (2.3). However, certain types of anisotropy may not be 
detected by these tests. For example, a geometrically anisotropic process having 
the major axis of the ellipses of equicorrelation parallel to the x-axis is axially 
symmetric, and the anisotropy wouldn’t be detected by a test of axial symmetry. 

Scaccia and Martin (2002, 2005) use the periodogram (2.5) to develop a test 
for axial symmetry. They propose three test statistics that are a function of the 
periodogram values. The first uses the average of the difference in the log of the 
periodogram values, log[/(a;i,a; 2 )] — log[7(a;i, — 1 <; 2 )]. The second and third test 
statistics use the average of standardized periodogram differences, [/(a;i,a; 2 ) — 
/(cui, —a; 2 )]/[/(a;i, W 2 ) +I{u}i, —a; 2 )]. These test statistics are shown to asymptot¬ 
ically follow a standard normal or t distribution via the Central Limit Theorem, 
and the corresponding distributions are used to obtain a p-value. 

Lu and Zimmerman (2005) also use the periodogram as an estimator of the 
spectral density to test properties of axial and complete symmetry of processes 
on the integer grid, I?. They use the asymptotic distribution of the periodogram 
to construct two potential test statistics. Both test statistics leverage the fact 
that, under certain conditions and at certain frequencies, 

(3.5) ,,1. 

J[LOi,UJ 2) ni,n2^oo 

Under the null hypothesis of axial or complete symmetry, (3.5) implies that ratios 
of periodogram values at different frequencies follow an F(2, 2) distribution. The 
preferred test statistic produces a p-value via a Cramer-von Mises (CvM) good¬ 
ness of fit test using the appropriate set of periodogram ratios. Because rejecting 
a hypothesis of axial symmetry implies rejecting a hypothesis of complete sym¬ 
metry, Lu and Zimmerman (2005) recommend a two-stage procedure for testing 
complete symmetry. At the first stage, they test the hypothesis of axial symmetry, 
and if the null hypothesis is not rejected, they test the hypothesis of complete 
symmetry. To control the overall type-I error rate at a, the tests at each stage 
can be performed using a significance level of a/2. 

Leveraging the asymptotic independence of the periodogram at different fre¬ 
quencies, Van Hala et al. (2014) propose a spatial frequency domain empirical 
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likelihood (SFDEL) approach that can be used for inference about spatial co- 
variance structure. One of the applications of this method is testing isotropy. 
An advantage of this method over other frequency domain approaches is that 
it can be used for non-gridded sampling locations. To implement the test, the 
user must select the set of lags and, because the sampling locations are not grid- 
ded, the number and spacing of frequencies. Van Hala et al. (2014) offer some 
guidelines for these choices based on the simulations and theoretical considera¬ 
tions (e.g., the frequencies need to be asymptotically distant). Once these choices 
are made. Van Hala et al. (2014) maximize an empirical likelihood under a mo¬ 
ment constraint corresponding to isotropy and show that the log-ratio of the con¬ 
strained and unconstrained maximizer asymptotically follows a distribution. 
The SFDEL method relies on the asymptotic independence of the periodogram 
values, and the smallest sample size used in simulations was n = 600. Thus, it is 
not clear how the method will perform for smaller sample sizes. 

Fuentes (2005) introduces a nonparametric, spatially varying spectral density 
to represent nonstationary spatial processes. While the method can be used to 
test the assumption of isotropy, the test requires a large sample size on a hne 
grid. For this reason and also because the test was primarily designed to test the 
assumption of stationarity, we do not consider it further. 

4. SIMULATION STUDY 

We designed a simulation study to compare the empirical size, power, and 
computational costs for four of the methods. For gridded sampling locations, we 
compare Lu and Zimmerman (2005) [hereafter, LZ] to Guan et al. (2004) [hereafter 
denoted as GSC or GSG-g when we are specihcally referring to the test when ap¬ 
plied to gridded sampling locations]. For uniformly distributed sampling locations 
we compare Maity and Sherman (2012) [MS] to Guan et al. (2004, 2007)[GSC-u 
for the method used for uniformly distributed sampling locations]. 

We performed the tests on the same realizations of the RF. Data are sim¬ 
ulated on rectangular grids or rectangular sampling domains because they are 
more realistic than square domains and simulations on rectangular domains were 
not previously demonstrated. We simulate Gaussian data with mean zero and 
exponential covariance functions with no nugget, a sill of one, and effective range 
values corresponding to short, medium, and long range dependence. We introduce 
varying degrees of geometric anisotropy via coordinate transformations governed 
by a rotation parameter 9 and scaling parameter R that dehne the ellipses of 
equicorrelation (see Figure 5 in the Appendix). The parameter 9 quantihes the 
angle between the major axis of the ellipse and the x-axis (counter-clockwise ro¬ 
tation) while R gives the ratio of the major and minor axes of the ellipse. We 
also performed simulations that investigate the effect of the lag set, block size, 
and bandwidth. Although some simulations are given in the original works, our 
simulations serve to provide a direct comparison of the effects of changing these 
values and provide further insight into how to choose them. See the Appendix 
for additional simulation details and results. 

Figures 2 and 3 illustrate a subset of the simulation results compring empirical 
size, power, and computational time (full results in Appendix, Tables 5 and 6). 
These simulations indicate that nonparametric tests for anisotropy have higher 
power for gridded (Figure 5) than for non-gridded (Figure 6) sampling designs. 
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In both comparisons the methods from GSC have favorable empirical power over 
the competitor with a comparable empirical size. As the effective range increases, 
both empirical size and power tend to increase for the methods from GSC, but 
they tend to decrease for MS. GSC-g and LZ have similar computation time, 
while MS is much more computationally expensive than GSC-u. This difference 
is due to the bootstrapping required by MS. 

Unsurprisingly, as the strength of anisotropy increases (measured by R), power 
increases for all the methods. For a geometrically anisotropic process, the major 
and minor axes of anisotropy are orthogonal. In comparing the effect of the orien¬ 
tation of isotropy (9) on the methods, it is important to note that when 9 = 0, the 
major axis of the ellipse defining the geometric anisotropy is parallel to the x-axis 
and corresponds to a spatial process that is axially symmetric but not completely 
symmetric. When 9 = 3tt/8 the major axis of the ellipse forms a 67.5-degree angle 
with the x-axis, and the spatial process is neither axially nor completely sym¬ 
metric (see Figure 5 in the Appendix for contours of equal correlation used in the 
simulation). The original works generally only simulate data from a geometrically 
anisotropic process with the major axis of anisotropy forming a 45-degree angle 
with the x-axis; hence, our simulation study more carefully explores the effect of 
changing the orientation of geometric anisotropy. The methods from GSC exhibit 
higher power when 9 = 0 than when 9 = Svr/S. This is due to the fact that the 
lag set. A, from (3.2) used for the tests contains a pair of spatial lags that are 
parallel to the major and minor axes of anisotropy when 9 = 0, indicating that 
an informed choice of spatial lags improves the test’s ability to detect anisotropy. 
This same result does not hold for MS. It is unclear whether this behavior is ob¬ 
served because the method uses the covariogram rather than the semivariogram, 
the GBBB rather than moving window approach for estimating S, or perhaps 
both. The simulation results indicate that the LZ test has low empirical power; 
however, this method was developed to test symmetry properties on square grids, 
and the choice of a rectangular grid for our simulation study does not allow for a 
large number of periodogram ordinates for the second stage of the procedure for 
testing the complete symmetry hypothesis. 

Results from simulations that investigate the effects of changing the lag set, the 
block size, and the bandwidth for non-gridded sampling locations are displayed 
in Tables 7-9, respectively, in the Appendix. For both GSC-u and MS, the lag 
set in (3.2) provided an empirical size close to the nominal level. Using more 
lags or longer lags decreased the size and power for GSC-u. This may be due 
to the additional uncertainty induced by estimating the covariance between the 
semivariance at more lags and the larger variance of semivariance estimates at 
longer lags. For MS the longer lags lead to an inflated size and more lags decreased 
the power. In this case, the GBBB may not be capturing the uncertainty in 
covariance estimates at longer lags with the chosen block size. The MS test was 
not overly sensitive to block size with larger blocks leading to slightly higher 
power. MS found that an overly large block size was adverse for test size. For 
GSC-u the small and normal sized windows performed at nominal size levels with 
comparable power while larger windows were detrimental to test size and power. 
For GSC-u, we find that choosing a large window tends to lead to overestimation 
of the asymptotic variance-covariance matrix due to fewer blocks being used 
to re-estimate the semi variance. Finally, the results investigating the bandwidth 
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selection for GSC-u indicate that choosing an overly large bandwidth inflates test 
size while choosing too small a bandwidth deflates test size and power. However, 
the results also indicate that, for the small sample size, test size and power are 
less negatively affected when approximating the p-value via the finite sample 
adjustment. 

Weller (2015b) demonstrates applications of several of these methods on two 
real data sets. The R package spTest (Weller, 2015a) implements the tests in LZ, 
GSC, and MS for rectangular grids and sampling regions and is available on the 
Comprehensive R Archive Network (CRAN). 
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GSC-g 
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(1.45s) 


LZ 

(4.99s) 


Fig 2: Empirical size and power for Guan et al. (2004) [GSC-g] and Lu and Zim¬ 
merman (2005) [LZ] for 500 realizations of a mean 0 GRF with gridded sampling 
locations using a nominal level of a = 0.05. Colors and shapes indicate the type 
of anisotropy. Gray points correspond to the isotropic case. The results corre¬ 
spond to a “medium” effective range. Computational time for each method is 
also displayed. 


5. RECOMMENDATIONS 

Based on the simulation results we offer recommendations for implementation 
of nonparametric tests of isotropy. The flow chart in Figure 1 along with Figure 
4 summarize the steps in the process. Tables 1-3 compare the tests. Table 4 
summarizes the recommendations provided below. 

In choosing a nonparametric test for isotropy, the distribution of sampling 
locations on the sampling domain is perhaps the most important consideration. 
Data on a grid simplifies estimation because the semivariogram or covariogram 
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N = 300 N = 450 N = 300 N = 450 



(2.17s) (4.44s) (83.40s) (271.22s) 

Method 

(Time: 1 Test) 


Fig 3: Empirical size and power for Guan et al. (2004) [denoted GU] and Maity 
and Sherman (2012) [denoted MS] for 200 realizations of a mean 0 GRF with 
uniformly distributed sampling locations using a nominal level of a = 0.05. Col¬ 
ors and shapes indicate the type of anisotropy. Gray points correspond to the 
isotropic case. The results correspond to a “medium” effective range. Computa¬ 
tional time for each method is also displayed. 


can be estimated at spatial lags that are exactly observed separating pairs of 
sampling locations. A grid also allows the option of using easily implemented 
tests in the spectral domain. 

Sample size requirements for the asymptotic properties of tests using the spatial 
domain to approximately hold will depend on the dependence structure of the 
random field. GSC note that convergence of their test statistic is slow in the case 
of gridded sampling locations and obtain an approximate p-value via subsampling 
rather than the asymptotic distribution. Tests using the spectral domain rely 
on the asymptotic independence of periodogram values, and correlation in finite 
samples can lead to an inflated test size (LZ). Based on our simulations, we 
recommend the sample size be at least 150 for gridded sampling locations and at 
least 300 for non-gridded sampling locations. However, power tends be low when 
the sample size is small and/or the anisotropy is weak (Figures 2 and 3). 

We focus on implementation of the methods that use the spatial domain for the 
remainder of this section. We discuss the choice of lags, block size, and bandwidth 
for the tests in GSC and MS. Due to the large number of choices required to 
implement the tests (e.g., block size, bandwidth, kernel function, subsampling 
method), features of the random field (e.g., sill, range), and properties of the 
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Fig 4; Spatial sampling design considerations for choosing a nonparametric hy¬ 
pothesis test of isotropy. The method we recommended for testing isotropy in 
each situation is given in bold including LZ = Lu and Zimmerman (2005); SM = 
Scaccia and Martin (2005); GSC-g = Guan et al. (2004) for gridded sampling lo¬ 
cations; GSC-u = Guan et al. (2004) for uniformly distributed sampling locations; 
MS = Maity and Sherman (2012); GSC-n = Guan et al. (2007) for non-uniform 
sampling locations. 


sampling design (e.g., density of sampling locations, shape of sampling domain), 
the recommendations we offer will not apply in all situations. The numerous 
moving parts make it challenging to develop general recommendations, especially 
when choosing a bandwidth. 

When determining the lag set. A, for use in (3.1), the user needs to select 

(a) the norm of the lags (e.g.. Euclidean distance), 

(b) the orientation (direction) of the lags, and 

(c) the number of lags. 

Regarding (a), short lags are preferred. Estimates of the spatial dependence at 
large lags may be less reliable than estimates at shorter lags because they are 
based on a smaller number of pairs of observations and hence more variable. 
Additionally, empirical and theoretical evidence (Lu and Zimmerman, 2001) in¬ 
dicates that values of 7 (-) in two different directions generally exhibit the largest 
difference at a lag less than the effective range, the distance beyond which pairs 
of observations can be assumed to be independent. Finally, there is mathemat¬ 
ical support that correctly specifying the covariance function at short lags is 
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important for spatial prediction (Stein, 1988). Considering (b), if the process is 
anisotropic, the ideal choice of A and A contrasts lags with the same norm but 
oriented in the direction of weakest and strongest spatial correlation. Typically, 
the directions of weakest and strongest spatial correlation will be orthogonal and 
thus, lags contrasted using the A matrix should also be orthogonal. Prior infor¬ 
mation, if available, about the underlying physical/biological process giving rise 
to the data can also be used to inform the orientation of the lags (Guan et ah, 
2004). If no prior information about potential anisotropy is available, lags ori¬ 
ented in the same directions as those in (3.2) are a good starting set. In regards 
to (c), detecting certain types of anisotropy requires a sufficient number of lags 
but using a large number of lags requires a large number of observations (Guan 
et ah, 2004). As a general guideline, we suggest using four lags to construct two 
contrasts. 

Several tests require selection of a window or block size to estimate the variance- 
covariance matrix. The moving window from GSC creates overlapping subblocks 
of data by sliding the window over a grid placed on the region. Each of these sub¬ 
blocks are used to re-estimate the semivariance. The block size from MS defines 
the size of resampled blocks when implementing the GBBB. The GBBB permutes 
resampled blocks to create a new realization of the process over the entire domain. 
Ghoosing the window size in GSG requires balancing two competing goals. First, 
the moving window should be large enough to create subblocks that are represen¬ 
tative of the dependence structure for the entire RF. Second, the window should 
be small enough to allow for a sufficient number of subblocks to re-estimate the 
semivariance, as these values are used to obtain an estimate of the asymptotic 
variance-covariance. A window that is too large or too small can potentially lead 
to under- or over-estimation of the asymptotic variance-covariance. For GSG-u, 
the windows must be large enough to ensure enough pairs of sampling locations 
are in each subblock to compute an estimate of the semivariance without having 
to over-smooth. For gridded sampling locations, GSG demonstrate good empirical 
size and power by using moving windows with size of only 2x2. However, they 
find slow convergence to the asymptotic distribution, and a p-value is instead 
computed by approximating the distribution of the test statistic by computing its 
value for each of the subblocks. Hence, approximating the p-value to two decimal 
places will require at least 100 subblocks over the sampling region. This may not 
be possible in practice. For example, a 12 x 12 grid of sampling locations with 
moving windows of size 2x2 results in only 90 subblocks when correcting for 
edge effects. The challenge of choosing the block size in MS is subject to similar 
considerations as the window size in GSG. The p-value for both tests will change 
when performing the test with different window or block sizes, and the user may 
decide to run the test with different block sizes (e.g., MS). There are a number of 
works on resampling spatial data to obtain an estimate of the variance of a spatial 
statistic (e.g., Sherman, 1996; Politis and Sherman, 2001; Lahiri, 2003; Lahiri and 
Zhu, 2006), but they do not directly consider variance estimation in the case of 
a nonparametric estimate of the semivariogram/covariogram. Denoting the num¬ 
ber of points per block as nb, Sherman (1996) proposes choosing the block size 
such that Ub ~ for a constant, c, when the spatial dependence does not ex¬ 

hibit a large range. In a number of different applications of spatial subsampling, 
c is typically chosen to be between 0.5 and 2 (Politis and Sherman, 2001; Guan 
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et al., 2004, 2006). Based on our simulations, we find acceptable empirical size 
and power for GSC-g using small windows and approximating the p-value with 
the finite sample adjustment. Thus, we recommend setting rib < for GSC-g. 
For example, we used windows with size 3x2 and 5x3 for sampling domains of 
18 X 12 and 25 x 15, respectively. In the case of uniformly distributed sampling 
locations (see Table 8 in the Appendix), the empirical size and power from GSC-u 
was negatively affected by a large moving window size; hence, we recommend set¬ 
ting c = 1 and choosing rib ^ For the MS test, a small block size negatively 
affected the empirical size and power; thus, we recommend choosing rib ^ 
for this test. 

Between the choices of a lag set, block size, and bandwidth, choosing an appro¬ 
priate bandwidth to smooth over observed spatial lags for non-gridded sampling 
locations is the most challenging. For GSC-u the user needs to choose the form 
of the smoothing kernel as well as the bandwidth for both the entire grid and 
the subblocks while MS use an Epanechnikov kernel and empirical bandwidth 
based on a user-specihed tuning parameter. If the selected bandwidth is too large 
then over-smoothing occurs. In oversmoothing, there is very little filtering of the 
lag distance and direction. The lack of filtering produces similar estimates of the 
spatial dependence at lags with different directions and distances. If the selected 
bandwidth is too small, then there is very little smoothing and estimates of the 
spatial dependence are based on a small number of pairs of sampling locations 
and thus highly variable. Considering the aforementioned effects of the band¬ 
width, the bandwidth should decrease as n increases under the usual increasing 
domain asymptotics. For example, simulations (not included) indicated a band¬ 
width of w = 0.65 maintains nominal size when n = 950, but leads to deflated test 
size and power when n = 400 on a smaller domain. Garcfa-Soidan et al. (2004), 
Garcfa-Soidan (2007), and Kim and Park (2012) develop theoretically optimal 
bandwidths for nonparametric semivariogram estimation, but these works are 
not applicable here because they focus on the isotropic case and require an esti¬ 
mate of the second derivative of the variogram. We have found that the empirical 
bandwidth used by MS tends to produce nominal size (see Table 6). For GSC-u we 
hnd the most consistent results with a bandwidth in the range of 0.60 < w < 0.90 
when using a normal kernel truncated at 1.5, but these values will change when 
a different truncation value or kernel function are employed. For small sample 
sizes (n < 500), our simulations demonstrate that test size and power are less 
affected by the choice of bandwidth when the p-value is approximated using a 
hnite sample adjustment, indicating poor convergence to the asymptotic dis¬ 
tribution. Thus, the user should consider using the finite sample adjustment for 
non-gridded sampling locations when the sample size is small there are at least 
100 subblocks. While it is challenging to choose a bandwidth for GSC-u and the 
p-value of the test is sensitive to this parameter, the method exhibits nominal 
size and substantially higher power than MS when chosen appropriately. 

6. CONCLUSIONS 

There are several important avenues of future research. Methods to more for¬ 
mally characterize the optimal block size and bandwidth parameters for the tests 
in the spatial domain would enhance the applicability of the tests. The perfor¬ 
mance of the tests for non-gridded data in Guan et al. (2004) and Maity and 
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Sherman (2012) are sensitive to these choices and their optimality remains an 
open question. Zhang et al. (2014) develop a nonparametric method for estimat¬ 
ing the asymptotic variance-covariance matrix of statistics derived from spatial 
data that avoids choosing tuning parameters which could simplify test implemen¬ 
tation. A second area of future work is further development of nonparametric tests 
of isotropy for gridded and non-gridded data in the spectral domain. A third area 
of further investigation is to compare nonparametric to parametric methods for 
testing isotropy, e.g., Scaccia and Martin (2011). A final area of future work is 
development of a formal definition and more careful quantification of power of 
the tests. For example, the degree of geometric anisotropy could be quantified 
using different characteristics of the covariance function, including the ratio of 
the major and minor axes of the ellipse, degree of rotation of the ellipse from 
the coordinate axes, and range of the process. Furthermore, it is important to 
consider the effects of density and design of sampling locations, sample size, and 
the amount of noise (nugget and sill) in the observations on a test’s ability to 
detect anisotropy. 

There is a volume of work on tests for isotropy in other areas of spatial statis¬ 
tics. Methods for detecting anisotropy in spatial point process data have been 
developed, e.g., Schabenberger and Gotway (2004, pg. 200-205), Guan (2003), 
Guan et al. (2006), and Nicolis et al. (2010). For multivariate spatial data, Jona- 
Lasinio (2001) proposes a test for isotropy. Gneiting et al. (2007) provide a review 
of potential second order assumptions and models for spatiotemporal geostatis- 
tical data, and a number of tests for second order properties of spatiotemporal 
data have been developed, e.g., Fuentes (2006), Li et al. (2007), Park and Fuentes 
(2008), Shao and Li (2009), Jun and Genton (2012). Li et al. (2008a) construct 
a test of the covariance structure for multivariate spatiotemporal data. Tests 
for isotropy have also been developed in the computer science literature (e.g., 
Molina and Feito, 2002; Chorti and Hristopulos, 2008; Spiliopoulos et ah, 2011; 
Thon et ah, 2015). 

Appropriately specifying the second order properties of the random field is 
an important step in modeling spatial data, and a number of models have been 
developed to capture anisotropy in spatial processes. Graphical tools, such as 
directional sample semivariograms, are commonly used to evaluate the assump¬ 
tion of isotropy, but these diagnostics can be misleading and open to subjective 
interpretation. We have presented and reviewed a number of procedures that 
can be used to more objectively test hypotheses of isotropy and symmetry with¬ 
out assuming a parametric form for the covariance function. These tests may be 
helpful for a novice user deciding on an appropriate spatial model. In abandoning 
parametric assumptions, these hypothesis testing procedures are subject and sen¬ 
sitive to choices regarding smoothing parameters, subsampling procedures, and 
finite sample adjustments. The test that is most appropriate for a set of data will 
largely depend on the sampling design. Additionally, there are trade-offs between 
the empirical power demonstrated by the tests and the number of choices user 
must make to implement the tests (e.g., between Guan et al. (2004) and Maity 
and Sherman (2012)). We have offered recommendations regarding the various 
choices of method and their implementation and have made the tests available in 
the spTest software. Because of the sensitivity of the tests to the various choices, 
we believe that graphical techniques and nonparametric hypothesis tests should 
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be used in a complementary role. Graphical techniques can provide an initial in¬ 
dication of isotropy properties and inform sensible choices for a hypothesis test, 
e.g., in choosing the spatial lag set, while hypothesis tests can affirm intuition 
about graphical techniques. 
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APPENDIX 

SIMULATION STUDY DETAILS AND FURTHER RESULTS 


We define the isotropic exponential covariance function as 


( 6 . 1 ) 


C{h) = 


a‘^ exjp{—(j)h) if/i > 0, 
otherwise 


where h = ||sj — Sj|| is the distance between sites Sj and Sj (Irvine et ah, 2007). 
The corresponding semivariogram is 'y{h) = (r^ + cr^) — a'^ exjp{—(j)h), where 
is the nugget, is the sill, and the effective range, the distance beyond 

which the correlation between observations is less than 0.05, is 

— 1 / 



Simulations in Section 4 were performed using the exponential covariance function 
(6.1) with a partial sill, cr^, of 1 and no nugget, = 0. We also performed 
simulations using different nugget values (results not included). As expected, 
introducing a nugget had an adverse effect on empirical test size and power. For 
the no nugget simulations, effective ranges, for isotropic processes were chosen 
to be 3, 6, and 12 corresponding to short, medium, and long range dependence. 
Geometric anisotropy was introduced by transforming the sampling locations 
according to a scaling parameter, R, and a rotation parameter, 9. Given an {R, 9) 
pair, the coordinates (x, y) are transformed to the “anisotropic” coordinates, 
ixa,ya) via 

'1 O' 

[o r \' 

A realization from the anisotropic process is then created by simulating using the 
distance matrix from the transformed coordinates and placing the observed values 
at their corresponding untransformed sampling locations. Figure 5 shows the 
isotropic exponential correlogram corresponding to = 1 and .^ = 6 and contours 
of equicorrelation corresponding to the {R, 9) values used in the simulation study. 
Note that a larger value of R corresponds to a more anisotropic process. 

For the simulations comparing the GSG-g and LZ (Lu and Zimmerman, 2005) 
tests in Table 5, data were simulated on a subset of the integer grid, I?. The 
p-values for the GG test were approximated using a finite sample statistic (Guan 
et ah, 2004), and we used the lag set in (3.2) and A matrix in (3.3). For the 
results involving the LZ test, a test of complete symmetry was performed as an 
approximation to the null hypothesis of isotropy. The p-values for the LZ test were 
obtained using the CvM* statistic. A nominal level of a = 0.05 was maintained 
by first testing reflection symmetry at a = 0.025 then testing complete symmetry 
at a = 0.025 if the hypothesis of reflection symmetry was not rejected. For the 
GG test, the moving window dimensions were 3x2 (width, height) and 5x3 for 
the parent grids of 18 x 12 and 25 x 15, respectively. 

For the simulations in Table 6 comparing the GU (Guan et ah, 2004) and 
MS (Maity and Sherman, 2012) tests, data were simulated at random, uniformly 
distributed sampling locations on 10 x 16 and 10 x 20 sampling domains. The 
lag set. A, used for both tests is given in (3.2) with A matrix (3.3), and the p- 
values for both methods were obtained using the asymptotic X 2 distribution. For 


ixa,ya) = ix,y) 


cos 9 sin 9 
— sin 9 cos 9 


imsart-sts ver. 2014/10/16 file: sts-isotropy6.tex date: Friday 6'^'' November, 2015 







2 


ZACHARY D. WELLER 


Contours of Equal Correlation (0.65) 
Effective Range = 6, Geometric Anisotropy 

Correlogram R = 0, 0 = 0 
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R = V2,0 = 3jt/8 R = 2,0 = 3it/8 



X lag X lag 

Fig 5: Correlogram and contours of equal correlation for the covariance models 
used in the simulation study. 
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semivariogram estimates in GU, we use independent (product) Gaussian (normal) 
kernels with a truncation parameter of 1.5. The bandwidth for the Gaussian kernel 
for smoothing over lags on the entire field and on moving windows was chosen 
as w = 0.75. We used the empirical bandwidth and the product Epanechnikov 
kernel given in Maity and Sherman (2012) to implement the MS test. For both 
tests, a grid with spacing of 1 was laid on the sampling region. Using this grid, 
the moving window dimensions for the GU test were 4x2 and the block size for 
the MS test were 4x2. For the MS test, B = 100 resamples using the GBBB 
were used to estimate the asymptotic variance-covariance matrix. 

For the results in Tables 7 - 9, we simulated mean 0, Gaussian RFs with ex¬ 
ponential covariance function with no nugget, a sill of one, and medium effective 
range (^ = 6). Sampling locations were generated randomly and uniformly over a 
16 X 10 sampling domain. We use the lag set and A matrix from 3.2 and 3.3, re¬ 
spectively, unless otherwise noted. All tests were performed using a nominal level 
of a = 0.05. For the GU tests, we use product Gaussian kernels with a truncation 
parameter of 1.5. For the MS tests, we use the default product Epanechnikov 
kernels with empirical bandwidth specified in Maity and Sherman (2012). 

The simulation results in Table 7 demonstrate the effects of changing the set of 
lags for the GU and MS tests. For these simulations, the lag set labeled “normal” 
corresponds to the lag set given in (3.2). The lag set labeled “long” represents 
the lags in (3.2) multiplied by 2.5. Finally, the lag set labeled “more” stands 
for the lags in (3.2) with the additional pair of lags {hs = (1.132, 0.469), hg = 
(—0.469,1.132)}. The lags hg and hg are a pair of lags the create approximate 
22.5° and 112.5° angles, respectively, with the x-axis (counter-clock wise rotation) 
and have Euclidean length of approximately 1.22. These were chosen to supple¬ 
ment the lag pairs (hi, 112 ) which have unit length and create 0° and 90° angles 
with the x-axis and (hg, h 4 ) which have length \/2 ~ 1.41 and create 45° and 135° 
angles with the x-axis. The lag sets are plotted in Figure 6. The A matrix for the 
“more” lagset was constructed as in (3.3), where orthogonal lags are contrasted. 
The p-values were calculated using the asymptotic distribution with degrees 
of freedom based on the number of pairs of lags contrasted. For the GU method, 
we used a bandwidth of 0.75. The moving window dimensions were 4x2. For 
the MS method, we chose block dimensions of 4 x 2 and used B = 75 resamples 
using the GBBB to estimate the asymptotic variance-covariance matrix. Table 
8 demonstrates the effects of changing the block size for the GU and MS tests. 
For these simulations, the labels “small”, “normal”, and “large” correspond to 
moving windows/blocks of size 3 x 2, 4 x 2, and 5x3, respectively. Because we 
simulated n = 300 uniformly distributed sampling locations on a 16 x 10 domain, 
we expect 1.875 sampling locations per unit area. Thus, we expect nj, = 11.3, 15, 
and 28.1 points per block for the small, normal, and large block sizes, respectively. 
We find that the methods tend to have nominal size when Ub ~ = 17.3. For 

both tests, we used the lags in (3.2), and the blocks are defined by a grid with 
spacing 0.5 placed on the sampling region (i.e., a 4 x 2 window is achieved by 
setting the window dimensions to 8 x 4 in the spTest software). We performed 
the tests using a nominal level of a = 0.05, and the p-values were calculated 
using the asymptotic distribution. For the GU method, we used a bandwidth 
of 0.75. For the MS method, we used B = 100 resamples using the GBBB to es¬ 
timate the asymptotic variance-covariance matrix. Finally, Table 9 demonstrates 
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Fig 6: The lag sets used for the simulations in Table 7. 


the effects of changing the bandwidth for the GU test. We use bandwidths of 
w = 0.65, 0.75, and 0.85. The p-values are calculated using both the asymptotic 
distribution and using a finite sample adjustment similar to the one used by 
Guan et al. (2004) for gridded sampling locations. 
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Table 5 

Empirical size and power for Guan et al. (2004) [denoted GG] and Lu and Zimmerman (2005) 
[denoted LZ[ for 500 realizations of a mean 0 GRF with gridded sampling locations using a 
nominal level of a = 0.05. Gomputational time for each method is also included. 

(a) Sample size of n = 216 gridded sampling 
locations. 


18 cols X 12 rows grid 



effective range 

R 

e 

Method 

3 

6 

12 

0 

0 

GG 

0.05 

0.07 

0.05 

LZ 

0.04 

0.04 

0.08 

V2 

0 

GG 

0.32 

0.42 

0.43 

LZ 

0.06 

0.11 

0.12 

2 

0 

GG 

0.91 

0.92 

0.94 


LZ 

0.14 

0.13 

0.15 

V2 

Stt 

GG 

0.27 

0.31 

0.34 


LZ 

0.14 

0.12 

0.13 

2 

Stt 

GG 

0.77 

0.85 

0.86 


8 

LZ 

0.29 

0.33 

0.33 

Computational Time for 1 Test 



GG 


1.11 seconds 



LZ 


1.45 seconds 


(b) Sample size of n = 375 gridded sampling 
locations. 


25 cols X 15 rows grid 



effective range 

R 

e 

Method 

3 

6 

12 

0 

0 

GG 

0.05 

0.06 

0.07 

LZ 

0.06 

0.07 

0.07 

V2 

0 

GG 

0.63 

0.61 

0.69 

LZ 

0.07 

0.09 

0.10 

2 

0 

GG 

0.98 

0.99 

0.99 


LZ 

0.14 

0.16 

0.15 

V2 

Stt 

GG 

0.47 

0.55 

0.55 


LZ 

0.16 

0.19 

0.18 

2 

Stt 

GG 

0.97 

0.99 

0.98 


8 

LZ 

0.37 

0.43 

0.45 

Computational Time for 1 Test 



GG 


7.29 seconds 



LZ 


4.99 seconds 
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Table 6 

Empirical size and power for Guan et al. (2004) [denoted GU] and Maity and Sherman (2012) 
[denoted MS[ for 200 realizations of a mean 0 GRF with uniformly distributed sampling 
locations using a nominal level of a = 0.05. Gomputational time for each method is also 

included. 

(a) Sample size of n = 300 uniformly dis¬ 
tributed sampling locations. 


10 height X 16 width domain 



effective range 

R 

6 

Method 

3 

6 

12 

0 

0 

GU 

0.02 

0.04 

0.05 

MS 

0.04 

0.05 

0.04 


0 

GU 

0.15 

0.20 

0.27 

MS 

0.10 

0.09 

0.08 

2 

0 

GU 

0.43 

0.57 

0.62 


MS 

0.21 

0.16 

0.15 


Stt 

GU 

0.12 

0.13 

0.16 


MS 

0.08 

0.07 

0.04 

2 

Stt 

GU 

0.37 

0.51 

0.51 


8 

MS 

0.27 

0.23 

0.21 

Computational Time for 1 Test 



GU 


2.17 seconds 



MS 


83.40 seconds 


(b) Sample size of n = 450 uniformly dis¬ 
tributed sampling locations. 


10 height X 20 width domain 



effective 

range 

R 

e 

Method 

3 

6 

12 

0 

0 

GU 

0.00 

0.04 

0.05 

MS 

0.05 

0.07 

0.03 

y/2 

0 

GU 

0.15 

0.22 

0.23 

MS 

0.07 

0.06 

0.07 

2 

0 

GU 

0.57 

0.68 

0.75 


MS 

0.32 

0.18 

0.14 

y/2 

Stt 

GU 

0.09 

0.18 

0.21 


MS 

0.12 

0.06 

0.08 

2 

Stt 

GU 

0.55 

0.58 

0.65 


8 

MS 

0.37 

0.23 

0.21 


Computational Time for 1 

Test 



GU 


4.44 seconds 



MS 


162.35 seconds 


Table 7 

Effects of changing the lag set. Empirical size and power for Guan et al. (2004) [denoted GU[ 
and Maity and Sherman (2012) [MS[ for 100 realizations of a mean 0 GRE with n = 400 
uniformly distributed sampling locations. The label “normal” corresponds to the lag set in (3.2), 
while “long” represents using longer lags, and “more” denotes using more lags (see Figure 6). 


16 width X 10 height domain 



Lag Set 

R 

6 

Method 

normal 

long 

more 

0 

0 

GU 

0.02 

0.00 

0.01 

MS 

0.03 

0.14 

0.03 


Stt 

GU 

0.19 

0.07 

0.16 


MS 

0.11 

0.24 

0.07 

2 

Stt 

GU 

0.56 

0.17 

0.40 


8 

MS 

0.27 

0.33 

0.21 
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Table 8 

Effects of changing the window/block size. Empirical size and power for Guan et al. (2004) 
[denoted GU] and Maity and Sherman (2012) [MS] for 200 realizations of a mean 0 GRF with 
n = 300 uniformly distributed sampling locations. The label “normal” corresponds to the 
window/block size of 4. x 2, while “small” represents using a smaller window, and “large” 

denotes using a larger window. 


16 width X 10 height domain 



Window/Block Size 

R 

e 

Method 

small 

normal 

large 

0 

0 

GU 

0.06 

0.04 

0.01 

MS 

0.03 

0.04 

0.02 


0 

GU 

0.17 

0.17 

0.08 

MS 

0.06 

0.09 

0.09 

2 

0 

GU 

0.56 

0.53 

0.22 


MS 

0.17 

0.17 

0.18 


Table 9 

Effects of changing bandwidth. Empirical size and power for Guan et al. (2004) [denoted GU[ 
for 100 realizations of a mean 0 GRF with n = 400 uniformly distributed sampling locations 

using a nominal level of a = 0.05. 

(a) P-value: asymptotic distribution 


16 width X 10 height domain 



Effective Range 

R 

e 

Bandwidth 

3 

6 

12 

0 

0 

0.65 

0.00 

0.00 

0.00 

0.75 

0.03 

0.06 

0.04 



0.85 

0.06 

0.11 

0.16 

V2 

Stt 

0.65 

0.01 

0.01 

0.08 


0.75 

0.08 

0.14 

0.24 



0.85 

0.14 

0.27 

0.35 

2 

Stt 

0.65 

0.21 

0.22 

0.25 


8 

0.75 

0.50 

0.54 

0.67 



0.85 

0.70 

0.73 

0.81 


(b) P-value: finite sample 


16 width X 10 height domain 



Effective Range 

R 

e 

Bandwidth 

3 

6 

12 

0 

0 

0.65 

0.02 

0.03 

0.02 

0.75 

0.03 

0.06 

0.06 



0.85 

0.07 

0.10 

0.09 


Stt 

0.65 

0.05 

0.06 

0.20 


0.75 

0.09 

0.18 

0.29 



0.85 

0.11 

0.24 

0.31 

2 

Stt 

0.65 

0.37 

0.38 

0.53 


8 

0.75 

0.55 

0.58 

0.69 



0.85 

0.63 

0.64 

0.76 
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